K-Graphs: Selecting Top-k Data Sources for XML Keyword Queries

Authors

  • Khanh Nguyen
  • Jinli Cao
Abstract

Existing approaches to XML keyword search mostly focus on querying a single data source. However, searching hundreds or even thousands of (distributed) data sources by sequentially querying each one is extremely costly and can therefore be impractical. In this paper, we propose an approach for selecting the top-k data sources for a given query, which avoids the high cost of searching numerous, potentially irrelevant data sources. The proposed approach efficiently selects the top-k most relevant data sources without querying the data sources themselves. We propose a ranking function that measures the strength of correlation between keywords in a data source, and we summarize the data sources as keyword correlation graphs (K-Graphs). The top-k relevant data sources are then selected by estimating the relevance of the corresponding K-Graphs to the query. Experimental results show that the approach achieves good performance across a variety of experimental parameters.
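The selection step described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the abstract does not give the ranking function, so the summary format (a mapping from keyword pairs to correlation weights), the scoring rule (a plain sum of edge weights over query keyword pairs), the function names `relevance` and `select_top_k`, and the source names are all hypothetical. The only idea taken from the abstract is that sources are ranked from their summaries alone, without querying the sources.

```python
import heapq
from itertools import combinations

# Hypothetical summary format (an assumption, not the paper's definition):
# each data source is summarized as a graph whose edges map an unordered
# keyword pair to a correlation weight.
KGraph = dict[frozenset[str], float]


def relevance(graph: KGraph, query: list[str]) -> float:
    """Assumed stand-in score: sum of edge weights over all query keyword pairs.

    Pairs that are absent from the summary contribute 0.
    """
    keywords = sorted(set(query))
    return sum(graph.get(frozenset(pair), 0.0) for pair in combinations(keywords, 2))


def select_top_k(summaries: dict[str, KGraph], query: list[str], k: int) -> list[str]:
    """Return the names of the k sources whose summaries score highest."""
    return heapq.nlargest(k, summaries, key=lambda name: relevance(summaries[name], query))


# Made-up summaries for three hypothetical sources.
summaries = {
    "source_a": {frozenset({"xml", "keyword"}): 0.9, frozenset({"keyword", "search"}): 0.8},
    "source_b": {frozenset({"xml", "keyword"}): 0.1},
    "source_c": {frozenset({"xml", "search"}): 0.6, frozenset({"keyword", "search"}): 0.5},
}

top = select_top_k(summaries, ["xml", "keyword", "search"], k=2)
print(top)
```

Only the compact summaries are consulted here, so the cost of ranking depends on the number of sources and query keywords rather than on the size of the underlying XML data. A real system would replace the summed-weight score with the ranking function defined in the paper.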

Similar resources

Processing XML Keyword Search by Constructing Effective Structured Queries

Recently, keyword search has attracted a great deal of attention in XML databases. It is hard to directly improve the relevancy of XML keyword search because many keyword-matched nodes may not contribute to the results. To address this challenge, in this paper we design an adaptive XML keyword search approach, called XBridge, that can derive the semantics of a keyword query and generate a set...

Keyword Proximity Search on XML Graphs

XKeyword provides efficient keyword proximity queries on large XML graph databases. A query is simply a list of keywords and does not require any schema or query language knowledge for its formulation. XKeyword is built on a relational database and, hence, can accommodate very large graphs. Query evaluation is optimized by using the graph’s schema. In particular, XKeyword consists of two stages...

Efficient top-k algorithm for eXtensible Markup Language keyword search

The ability to compute top-k matches to eXtensible Markup Language (XML) queries is gaining importance owing to the increasing number of large XML repositories. Current work on top-k matching for XML queries mainly focuses on employing XPath, XQuery or NEXI as the query language, whereas little work has addressed top-k matching for XML keyword search. In this study, the authors propose a novel two-layer-ba...

Implementation of Efficient Keyword Routing in Linked Data

Keyword search is an intuitive paradigm for searching linked data sources on the web. We propose to route keywords only to relevant sources to reduce the high cost of processing keyword search queries over all sources. We propose a novel method for computing top-k routing plans based on their potentials to contain results for a given keyword query. We employ a keyword-element relationship summa...

SAIL: Structure-aware indexing for effective and progressive top-k keyword search over XML documents

Keyword search in XML documents has recently gained a lot of research attention. Given a keyword query, existing approaches first compute the lowest common ancestors (LCAs), or their variants, of the XML elements that contain the input keywords, and then identify the subtrees rooted at the LCAs as the answer. In this paper we study how to use the rich structural relationships embedded in XML docu...

Publication date: 2011